29 research outputs found

    Precedence-constrained scheduling problems parameterized by partial order width

    Full text link
    Negatively answering a question posed by Mnich and Wiese (Math. Program. 154(1-2):533-562), we show that P2|prec,pj{1,2}p_j{\in}\{1,2\}|CmaxC_{\max}, the problem of finding a non-preemptive minimum-makespan schedule for precedence-constrained jobs of lengths 1 and 2 on two parallel identical machines, is W[2]-hard parameterized by the width of the partial order giving the precedence constraints. To this end, we show that Shuffle Product, the problem of deciding whether a given word can be obtained by interleaving the letters of kk other given words, is W[2]-hard parameterized by kk, thus additionally answering a question posed by Rizzi and Vialette (CSR 2013). Finally, refining a geometric algorithm due to Servakh (Diskretn. Anal. Issled. Oper. 7(1):75-82), we show that the more general Resource-Constrained Project Scheduling problem is fixed-parameter tractable parameterized by the partial order width combined with the maximum allowed difference between the earliest possible and factual starting time of a job.Comment: 14 pages plus appendi

    Looking at Vector Space and Language Models for IR using Density Matrices

    Full text link
    In this work, we conduct a joint analysis of both Vector Space and Language Models for IR using the mathematical framework of Quantum Theory. We shed light on how both models allocate the space of density matrices. A density matrix is shown to be a general representational tool capable of leveraging capabilities of both VSM and LM representations thus paving the way for a new generation of retrieval models. We analyze the possible implications suggested by our findings.Comment: In Proceedings of Quantum Interaction 201

    Predicting sample size required for classification performance

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Supervised learning methods need annotated data in order to generate efficient models. Annotated data, however, is a relatively scarce resource and can be expensive to obtain. For both passive and active learning methods, there is a need to estimate the size of the annotated sample required to reach a performance target.</p> <p>Methods</p> <p>We designed and implemented a method that fits an inverse power law model to points of a given learning curve created using a small annotated training set. Fitting is carried out using nonlinear weighted least squares optimization. The fitted model is then used to predict the classifier's performance and confidence interval for larger sample sizes. For evaluation, the nonlinear weighted curve fitting method was applied to a set of learning curves generated using clinical text and waveform classification tasks with active and passive sampling methods, and predictions were validated using standard goodness of fit measures. As control we used an un-weighted fitting method.</p> <p>Results</p> <p>A total of 568 models were fitted and the model predictions were compared with the observed performances. Depending on the data set and sampling method, it took between 80 to 560 annotated samples to achieve mean average and root mean squared error below 0.01. Results also show that our weighted fitting method outperformed the baseline un-weighted method (p < 0.05).</p> <p>Conclusions</p> <p>This paper describes a simple and effective sample size prediction algorithm that conducts weighted fitting of learning curves. The algorithm outperformed an un-weighted algorithm described in previous literature. It can help researchers determine annotation sample size for supervised machine learning.</p

    In silico approach to screen compounds active against parasitic nematodes of major socio-economic importance

    Get PDF
    Infections due to parasitic nematodes are common causes of morbidity and fatality around the world especially in developing nations. At present however, there are only three major classes of drugs for treating human nematode infections. Additionally the scientific knowledge on the mechanism of action and the reason for the resistance to these drugs is poorly understood. Commercial incentives to design drugs that are endemic to developing countries are limited therefore, virtual screening in academic settings can play a vital role is discovering novel drugs useful against neglected diseases. In this study we propose to build robust machine learning model to classify and screen compounds active against parasitic nematodes.A set of compounds active against parasitic nematodes were collated from various literature sources including PubChem while the inactive set was derived from DrugBank database. The support vector machine (SVM) algorithm was used for model development, and stratified ten-fold cross validation was used to evaluate the performance of each classifier. The best results were obtained using the radial basis function kernel. The SVM method achieved an accuracy of 81.79% on an independent test set. Using the model developed above, we were able to indentify novel compounds with potential anthelmintic activity.In this study, we successfully present the SVM approach for predicting compounds active against parasitic nematodes which suggests the effectiveness of computational approaches for antiparasitic drug discovery. Although, the accuracy obtained is lower than the previously reported in a similar study but we believe that our model is more robust because we intentionally employed stringent criteria to select inactive dataset thus making it difficult for the model to classify compounds. The method presents an alternative approach to the existing traditional methods and may be useful for predicting hitherto novel anthelmintic compounds.12 page(s

    Unshuffling Permutations

    Get PDF
    International audienceA permutation is said to be a square if it can be obtained by shuffling two order-isomorphic patterns. The definition is intended to be the natural counterpart to the ordinary shuffle of words and languages. In this paper, we tackle the problem of recognizing square permutations from both the point of view of algebra and algorithms. On the one hand, we present some algebraic and combinatorial properties of the shuffle product of permutations. We follow an unusual line consisting in defining the shuffle of permutations by means of an unshuffling operator, known as a coproduct. This strategy allows to obtain easy proofs for algebraic and combinatorial properties of our shuffle product. We besides exhibit a bijection between square (213, 231)-avoiding permutations and square binary words. On the other hand, by using a pattern avoidance criterion on oriented perfect matchings, we prove that recognizing square permutations is NP-complete
    corecore